SGAN: An Alternative Training of Generative Adversarial Networks
The Generative Adversarial Networks (GANs) have demonstrated impressive
performance for data synthesis, and are now used in a wide range of computer
vision tasks. Despite this success, they have gained a reputation for being
difficult to train, which makes the development process time-consuming and
dependent on manual intervention.
We consider an alternative training process, named SGAN, in which several
adversarial "local" pairs of networks are trained independently so that a
"global" supervising pair of networks can be trained against them. The goal is
to train the global pair with the corresponding ensemble opponent for improved
performance in terms of mode coverage. This approach aims at increasing the
chances that learning will not stop for the global pair, preventing both
networks from being trapped in an unsatisfactory local minimum or from facing
the oscillations often observed in practice. To guarantee the latter, the
global pair never affects
the local ones.
The rules of SGAN training are thus as follows: the global generator and
discriminator are trained using the local discriminators and generators,
respectively, whereas the local networks are trained with their fixed local
opponent.
Experimental results on both toy and real-world problems demonstrate that
this approach outperforms standard training in mitigating mode collapse and in
stability during convergence, and that, surprisingly, it increases the
convergence speed as well
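The SGAN training rules stated above can be sketched as a per-round update schedule. This is a minimal illustration only: the network names, the function `sgan_update_schedule`, and the ordering of updates within a round are hypothetical, and the abstract does not say how the losses from the ensemble of local opponents are combined.

```python
from typing import List, Tuple


def sgan_update_schedule(num_local_pairs: int) -> List[Tuple[str, List[str]]]:
    """Return (network being updated, opponents it trains against) for one round.

    Names such as "G_local_0" are placeholders for illustration, not identifiers
    from the paper.
    """
    schedule: List[Tuple[str, List[str]]] = []

    # Each local generator/discriminator trains only against its own local partner.
    for i in range(num_local_pairs):
        schedule.append((f"G_local_{i}", [f"D_local_{i}"]))
        schedule.append((f"D_local_{i}", [f"G_local_{i}"]))

    local_discriminators = [f"D_local_{i}" for i in range(num_local_pairs)]
    local_generators = [f"G_local_{i}" for i in range(num_local_pairs)]

    # The global generator trains against all local discriminators, and the
    # global discriminator against all local generators.
    schedule.append(("G_global", local_discriminators))
    schedule.append(("D_global", local_generators))
    return schedule


if __name__ == "__main__":
    for network, opponents in sgan_update_schedule(2):
        print(f"update {network} against {opponents}")
```

No global network ever appears in an opponent list, which reflects the rule that the global pair never affects the local ones.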
Occam's hammer: a link between randomized learning and multiple testing FDR control
We establish a generic theoretical tool to construct probabilistic bounds for
algorithms where the output is a subset of objects from an initial pool of
candidates (or more generally, a probability distribution on said pool). This
general device, dubbed "Occam's hammer", acts as a meta layer when a
probabilistic bound is already known on the objects of the pool taken
individually, and aims at controlling the proportion of the objects in the
output set that do not satisfy their individual bound. In this regard, it can
be seen as a non-trivial generalization of the "union bound with a prior"
("Occam's razor"), a familiar tool in learning theory. We give applications of
this principle to randomized classifiers (providing an interesting alternative
approach to PAC-Bayes bounds) and multiple testing (where it allows one to
recover exactly, and to extend, the so-called Benjamini-Yekutieli testing
procedure).
Comment: 13 pages -- conference communication type forma
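For readers unfamiliar with the Benjamini-Yekutieli procedure mentioned above, here is a minimal sketch of its standard step-up form. It is not the derivation via Occam's hammer from the paper; the function name `benjamini_yekutieli` and its signature are chosen here for illustration.

```python
from typing import List


def benjamini_yekutieli(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Standard Benjamini-Yekutieli step-up procedure.

    Returns a list of booleans, True where the corresponding hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Harmonic correction factor c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / j for j in range(1, m + 1))

    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * alpha / (m * c(m)).
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_yekutieli([0.9, 0.001, 0.5], alpha=0.05))
```

The harmonic factor c(m) is what distinguishes this procedure from the Benjamini-Hochberg one and makes the false discovery rate guarantee hold under arbitrary dependence between the tests.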
Multi-Modal Mean-Fields via Cardinality-Based Clamping
Mean Field inference is central to statistical physics. It has attracted much
interest in the Computer Vision community to efficiently solve problems
expressible in terms of large Conditional Random Fields. However, since it
models the posterior probability distribution as a product of marginal
probabilities, it may fail to properly account for important dependencies
between variables. We therefore replace the fully factorized distribution of
Mean Field by a weighted mixture of such distributions that similarly
minimizes the KL-Divergence to the true posterior. We perform this
minimization efficiently by introducing two new ideas: conditioning on groups
of variables instead of single ones, and selecting such groups using a
parameter of the conditional random field potentials that we identify with the
temperature in the sense of statistical physics. Our extension of the clamping
method proposed in previous works allows us to both produce a more descriptive
approximation of the true posterior and, inspired by the diverse MAP paradigms,
fit a mixture of Mean Field approximations. We demonstrate that this positively
impacts real-world algorithms that initially relied on mean fields.
Comment: Submitted for review to CVPR 201
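The limitation of a single fully factorized approximation can be illustrated on a toy problem. The sketch below runs naive mean-field updates on a small Ising model, used here as an assumed stand-in for the conditional random fields of the paper; it is not the clamping method itself, and `naive_mean_field` and its parameters are hypothetical names.

```python
import math
from typing import List, Optional


def naive_mean_field(
    J: List[List[float]],
    h: List[float],
    beta: float = 1.0,
    sweeps: int = 200,
    init: Optional[List[float]] = None,
) -> List[float]:
    """Naive mean-field magnetizations for an Ising model.

    Assumed model: p(s) proportional to
    exp(beta * (sum_{i<j} J[i][j] s_i s_j + sum_i h[i] s_i)), with s_i in {-1, +1}
    and J symmetric. beta is the inverse temperature.
    """
    n = len(h)
    m = list(init) if init is not None else [0.01] * n
    for _ in range(sweeps):
        for i in range(n):
            # Effective field on spin i given the current means of the others.
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            # Coordinate-wise fixed-point update of the factorized distribution.
            m[i] = math.tanh(beta * field)
    return m


if __name__ == "__main__":
    J = [[0.0, 1.0], [1.0, 0.0]]  # two ferromagnetically coupled spins
    h = [0.0, 0.0]
    print(naive_mean_field(J, h, beta=2.0, init=[0.1, 0.1]))
    print(naive_mean_field(J, h, beta=2.0, init=[-0.1, -0.1]))
    print(naive_mean_field(J, h, beta=0.3, init=[0.1, 0.1]))
```

At low temperature (large `beta`) the two initializations converge to opposite solutions, each covering only one mode of a posterior that is symmetric between the two. This is the kind of situation in which a mixture of mean-field distributions is more descriptive than a single one.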
Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection
People detection in single 2D images has improved greatly in recent years.
However, comparatively little of this progress has percolated into multi-camera
multi-people tracking algorithms, whose performance still degrades severely
when scenes become very crowded. In this work, we introduce a new architecture
that combines Convolutional Neural Nets and Conditional Random Fields to
explicitly model those ambiguities. One of its key ingredients is a set of
high-order CRF terms that model potential occlusions and give our approach its
robustness even when many people are present. Our model is trained end-to-end
and we show that it outperforms several state-of-the-art algorithms on
challenging scenes
Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition
We present a unified framework for understanding human social behaviors in
raw image sequences. Our model jointly detects multiple individuals, infers
their social actions, and estimates the collective actions with a single
feed-forward pass through a neural network. We propose a single architecture
that does not rely on external detection algorithms but rather is trained
end-to-end to generate dense proposal maps that are refined via a novel
inference scheme. The temporal consistency is handled via a person-level
matching Recurrent Neural Network. The complete model takes as input a sequence
of frames and outputs detections along with the estimates of individual actions
and collective activities. We demonstrate state-of-the-art performance of our
algorithm on multiple publicly available benchmarks
Learning Interpolations between Boltzmann Densities
We introduce a training objective for continuous normalizing flows that can
be used in the absence of samples but in the presence of an energy function.
Our method relies on either a prescribed or a learnt interpolation $f_t$ of
energy functions between the target energy $f_1$ and the energy function $f_0$
of a generalized Gaussian. The interpolation of energy functions induces an
interpolation of Boltzmann densities $p_t \propto e^{-f_t}$, and we aim to find
a time-dependent vector field $V_t$ that transports samples along the family
$p_t$ of densities. The condition of transporting samples along the family
$p_t$ is equivalent to satisfying the continuity equation with $V_t$ and
$p_t = Z_t^{-1} e^{-f_t}$. Consequently, we optimize $V_t$ and $f_t$ to satisfy
this partial differential equation. We experimentally compare the proposed
training objective to the reverse KL-divergence on Gaussian mixtures and on
the Boltzmann density of a quantum mechanical particle in a double-well
potential.
Comment: TML
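The continuity-equation condition described in this abstract can be written out explicitly. The notation is an assumption introduced for illustration: $f_t$ denotes the interpolated energy, $p_t = e^{-f_t}/Z_t$ the corresponding Boltzmann density with normalizing constant $Z_t$, and $V_t$ the time-dependent vector field. Dividing the continuity equation by $p_t$ gives a condition stated directly in terms of the energy:

```latex
% Continuity equation for the density p_t transported by the vector field V_t
\partial_t p_t + \nabla \cdot (p_t V_t) = 0,
\qquad p_t = \frac{e^{-f_t}}{Z_t}.

% Using \partial_t p_t = -p_t (\partial_t f_t + \partial_t \log Z_t)
% and \nabla p_t = -p_t \nabla f_t, then dividing by p_t > 0:
\partial_t f_t + \partial_t \log Z_t - \nabla \cdot V_t + V_t \cdot \nabla f_t = 0.
```

The term $\partial_t \log Z_t$ depends on time only, not on position, so it acts as a spatially constant offset in the second equation. How the paper treats this term in its training objective is not stated in the abstract.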